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^ ! Abstract 



This work proposes a sampling-based (non-intrusive) approach within the 
context of low-rank separated representations to tackle the issue of curse-of- 
dimensionality associated with the solution of models, e.g., PDEs/ODEs, 
with high- dimensional random inputs. Under some conditions discussed in 
details, the number of random realizations of the solution, required for a suc- 
cessful approximation, grows linearly with respect to the number of random 
inputs. The construction of the separated representation is achieved via a 
regularized alternating least-squares regression, together with an error indi- 
cator to estimate model parameters. The computational complexity of such a 
construction is quadratic in the number of random inputs. The performance 
of the method is investigated through its application to three numerical ex- 
amples including two ODE problems with high- dimensional random inputs. 
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1. Introduction 

In most practical situations, the behavior and evolution of complex sys- 
tems are known only with limited certainty. This is mainly due to the lack of 
knowledge about the governing physical laws or limited information regarding 
their operating conditions and input parameters, e.g., fuel or material prop- 
erties. A reliable computer simulation of such systems, therefore, requires a 
systematic representation of uncertainties and quantification of their impact 
on quantities of interest. 

In the probabilistic framework, uncertainties are represented using ran- 
dom variables and processes characterized by, for instance, the available ex- 
perimental data or expert opinion. A major task is then to quantify the 
dependence of quantities of interest - also random variables or processes - 
on these random inputs. Development of efficient numerical tools for the 
computation of such mappings has been a research subject over the past few 
decades and has become an emerging field of study more recently. 

Statistical tools such as Monte Carlo sampling and its variations have 
been widely used for this purpose. To address the low convergence rate of 
these methods, stochastic basis expansions, for instance, in chaos polyno- 
mials [32, 31, 76] and multivariate numerical integration/interpolation, such 
as those based on the sparse grid collocation [69, 52, 75], have been re- 
cently proposed. While proven efficient both numerically and analytically 
on numerous problems in engineering and science, these techniques may face 
difficulties when input uncertainties are characterized by a large number of 
independent random variables (i.e., high-dimensional random inputs) [75, 74]. 
Specifically, the computational complexity of these methods, in their original 
forms, grows exponentially fast with respect to the number of input random 
variables: an issue known as the curse- of- dimensionality. The reason for 
such a fast growth is the tensor product construction of multi-dimensional 
bases (in polynomial chaos methods) or quadrature rules (in sparse grid col- 
location approaches) from one- dimensional bases or quadrature rules, respec- 
tively. Such tensorizations, therefore, impose an explicit dependence on the 
input dimensionality. Instead, a number of recent techniques have been pro- 
posed that exploit, for instance, low-rank or sparsity structures of quantities 
of interest in order to reduce this exponential complexity growth, see, e.g., 
[23, 70, 57, 11, 49, 10, 12, 27, 58, 25]. However, effective treatment of the 
curse-of-dimensionality still remains an open problem. 

Alternatively, in the present study, we propose an approach that con- 
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structs a separated representation of the solution of interest. Specifically, let 
u{y) with y{uj) = {yi{u), . . . ,yd{uj)) : M'', u ^ and G N, be a 

stochastic function defined on a suitable probability space {Q,J^,V). Then 
u{y) admits a separated representation if, for some (small) r G N and accu- 
racy e > 0, 



where the univariate functions {u[{yk)}f.=i, I = 1, . . . ,r, are unknown and 
{si}, I = l,...,r, are some normalization constants. When the separation 
rank r is independent of d, the approximation (1) may be obtained with a 
computational complexity that is linear in d, [8, 9, 7, 25], hence drastically re- 
ducing the curse-of-dimensionality. When the approximation (1) is achieved 
by a small r, then u{y) is said to admit a low-rank separated representation. 

Separated representations, also known as canonical decompositions (CAN- 
DECOMP) or parallel factor analysis (PARAFAC), have been first intro- 
duced by Hitchcock [39] to represent a multi-way tensor as a finite sum of 
rank-one tensors. Consequently, they have been extensively applied to several 
areas including image compression and classification [66, 30], telecommunica- 
tion [68, 67, 48], neuroscience [54, 29, 6], chemometrics [14], and data mining 



Separated representations have been recently used for the reduced order 
solution of deterministic PDEs in solid and fluid mechanics, where they are 
often called proper generalized decompositions (PGDs), see, e.g., [45, 46, 4, 
5, 16, 56, 15, 47, 18, 17] and the references therein. 

In the context of stochastic problems, an expansion based on the sepa- 
ration of spatial and random variables has been proposed in [58, 59]. While 
this approach significantly reduces the computational complexity of full- 
order polynomial chaos (PC) approximations, it suffers from the curse-of- 
dimensionality for high-dimensional random inputs. A separated representa- 
tion of the form (1) for the solution of PDEs and ODEs with high-dimensional 
random inputs has been first studied in [26, 24, 25]. Similar representations, 
although with different solution strategies, have been later investigated in 



The current separated approximation of models with high-dimensional 
random inputs are intrusive, that is, one has to rewrite deterministic solvers 
in order to propagate the uncertainty. However, for many large-scale complex 
systems, it is desirable to develop non-intrusive solvers, where determinis- 
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(1) 



[8, 9, 44, 42, 71, 35, 1, 7, 2, 43]. 



[60, 41, 28]. 
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tic codes are treated as black box. Such a construction is the purpose of 
the present study. Specifically, we here extend the alternating least-squares 
regression approach of [7] to problems with high- dimensional random in- 
puts. To enhance the stability and accuracy of this technique, we propose a 
Tikhonov regularization of the regression problem along with an error indica- 
tor for the selection of two main parameters of the separated representation. 
Under some conditions to be discussed in Section 3.4, the number of random 
realizations of the solution, required for a successful approximation, grows 
linearly with respect to d. Furthermore, the computational complexity of the 
alternating least-squares regression approach is quadratic in d. 

The rest of this paper is structured as follows. In Section 2, we intro- 
duce the problem of interest in an abstract form. Following that, in Sec- 
tion 3, we discuss the separated representations in more details and describe 
their non-intrusive construction using an alternating least-squares regression. 
Additionally, we discuss the computational complexity of this non-intrusive 
method. In Section 4, we propose a regularization approach together with an 
error indicator to derive a stopping criterion and, hence, avoid over-fitting the 
solution realizations. Finally, in Section 5, we demonstrate the performance 
and efficiency of the proposed approach on a manufactured function as well 
as two ODE problems with high-dimensional random inputs. The first ODE 
problem is a linear elliptic equation with random diffusion coefficient. The 
second one is concerned with a hydrogen oxidation problem where reaction 
rate constants and species thermodynamics are uncertain. 

2. Problem setup 

Let (fi, J^, V) be a complete probability space on which the random inputs 
are defined. Here VL is the set of elementary events, J-" C 2^ is the cx-algebra 
of events, and P : J-" — J- [0, 1] is a probability measure function defined on 
J-". For the sake of demonstration, we consider a generic stochastic ordinary 
differential equation (ODE) 

A{t,y{uj)-u) = {), G [0,T] X 1], (2) 

that holds V-a.s. for u eVL. Appropriate initial condition and forcing func- 
tion are also considered. Here, A denotes the ordinary differential operator, 
t G [0, T], T > 0, is the time variable, and y{<jj) = {yi{uj), . . . , : ^ — )■ 

Mf^, d G N, represents the vector of random variables defining the input un- 
certainty. We further assume that the components of the random vector y{u}) 
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are statistically independent and identically distributed (i.i.d.) according to 
a probability density function p{yk) : F C R — M>o, k = 1, . . . ,d. 
Our goal is to estimate the possibly non-linear solution 



uit, y) := u{t, y{uj)) = u{t, 2/1(0;), ... , yd{u)) : [0, T] x T'^ ^ M (3) 



of (2) in a non-intrusive fashion, i.e., with the assumption that the determin- 
istic solver of (2) is a black box. This entails the evaluation of u{t, y) at a set 
of random or deterministic realizations of y and subsequently the construc- 
tion of an approximation u{t, y) based on an interpolation, a regression, or 
a discrete projection scheme. 

In the present study, we consider the random sampling of y, similar to 
the standard Monte Carlo simulation. We denote by D a set of inde- 
pendent realizations y^^\ j = 1, . . . , N, drawn according to the probability 
density function p{y) = Y[k=i PiVk)^ the corresponding solution realiza- 
tions u{t, y^^^), 



Using the data set D, we then extend the regression approach of [7] to 
stably construct u{t,y) in the separated form (1). 

For the interest of description and without loss of generality, we hence- 
forth restrict our analyses to a fixed instance of t and adopt the short notation 
u{y) for the solution of interest. We next introduce the separated representa- 
tion (1) in more details and subsequently discuss our non- intrusive approach 
to construct and regularize such an approximation. 

3. Separated representations 

To motivate the separated representation of u{y), we first briefly de- 
scribe the standard multivariate approximation of u{y) using polynomial 
chaos (PC) expansions [32, 76]. 

Let {'ipak{yk)}k=i be the set of univariate spectral polynomials of degree 
afc G No := N U {0} orthonormal with respect to the probability density 
function p{yk), i.e.. 



D = {{y^^^;u{t,y'-^^))} 



N 



(4) 




A tensorization of the univariate basis {'ipak{yk)}k=iy i-^-, 

i^ciy) = ^/'ai(yi)^a2(2/2) • --i^aAyd), " e No, 



(5) 
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with Nq = {{ai, . . . ,ad) : dk ^ ^o}, forms an orthonormal basis in 
L2{T'^,V). Then, for u{y) G L2{T'^,V), the infinite series 

u{y) = ^ s^tpc^iy) (6) 

with coefficients 

So = / u{y)i)c.{y)p{y)dy (7) 

converges in the mean-squares sense. The series (6) is referred to as the 
polynomial chaos (PC) expansion. 

We highlight a number of properties of the PC representation in (6). 
Firstly, as observed in (5), the multivariate basis functions ipct{y) are of sep- 
arated form with respect to random inputs t/k- Secondly, the basis functions 
ipaiy) are a priori selected based on the probability density function p{yk) of 
yk- Therefore, the PC expansion (6) has a linear structure with respect to the 
coefficients s^. Thirdly, due to the particular tensor product construction (5) 
of ijjctiy), the cardinality of a finite degree basis {ipa{y)} with, for instance, 
a e Nq'^ = {a eNq : ||q:||i < p, p G No} grows exponentially with d. This 
fast growth leads to the issue of curse-of-dimensionality as the computational 
complexity of estimating the coefficients grows exponentially fast. 

Alternatively, the separated representation of a function u{y) is a decom- 
position of the form 

r 

u{y) = Yl ■ ■ ■ ^^diVd) + siy). (8) 

1=1 

Here, the separation rankr and univariate /actors/functions {Wfc(yfc)}fc=i, 
/ = 1, . . . , r, are not fixed a priori and are outcomes of the approximation such 
that a prescribed accuracy ||£:|| < e is achieved. The scalars {s;}, I = 1, . . . ,r, 
may be viewed as some positive normalization constants which we will specify 
later. We note that, by construction, the separated representation (8) is a 
non-linear expansion and when u admits a low separation rank r, a fast decay 
of the error with respect to r may be achieved. Due to its separated form 
with respect to the variables yk, the non-linear approximation (8) may be 
computed using multi-linear approaches such as the alternating least-squares 
(ALS) scheme described in Section 3.2. As we shall see in Section 3.4, for 
situations where r is independent of d, the separated representation (8) may 
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be obtained with a number of samples and computational cost that are, 
respectively, linear and quadratic in d. In practice, however, if r weakly 
depends on d, e.g., r ~ 0(\ogd), the above dependencies will be near-linear 
and near-quadratic, respectively. 

Remark 1 (Connection with sparse PC expansions). Due to the tensor prod- 
uct construction of PC basis functions ipociy) in (5), a sparse PC expansion 
is of a low-rank separated form. By sparse PC expansions, we mean repre- 
sentations of the form (6) in which many of the coefficients Sa, ct G Nq''^, 
are negligible. The relevance of the sparse PC approximation of PDEs with 
random inputs has been discussed in several work including [11, 20, 27]. 

3.1. A non-intrusive construction 

Given the realizations {y^-'^ in D, we formulate a discrete (pseudo) inner 
product between u{y), v{y) : F"^ — )■ M as 

which induces the (pseudo) norm 

\\u\\d = {u,u)]^'^ . (9) 

Let 



r 




k = 1 

Ur= {^siu[{yi)---u\{yd): : T -> M, G M>o, i^^" 

1=1 ' ■ ■ 

be the space of separated real functions in d dimensions and with separation 
rank r. The non-intrusive separated approximation of u{y) may then be cast 
into a least-squares regression problem of the form 

= arg min — (10) 

Due to the tensor product structure of W^, the optimization problem (10) 
is non-linear. Furthermore, the separation rank r is not known a priori and 
has to be estimated. A discrete formulation of (10), as we shall describe in 
Section 3.2, may be solved using non-linear programming techniques such 
as the damped Gauss- Newton, see, e.g., [61]. These non-linear schemes, 
however, suffer from the curse-of-dimensionality and their current use is thus 
limited to only low-dimensional problems. Alternatively, we here adopt a 
multi-linear optimization technique known as the alternating least-squares 
(ALS) to solve (10). We next describe the details of such an approach. 
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3.2. Alternating least-squares (ALS) approach 

In the ALS method, we construct a sequence of one-dimensional least- 
squares regression problems to solve for the univariate functions {u''f,{yk)}, 
I = 1, . . . ,r, along a direction k. In doing so, we freeze variables along all 
other directions at their current approximation. We then alternate over all 
directions and repeat the regression process. Once the algorithm converges 
after a number of directional sweeps is performed, the separation rank r is 
increased if the residual norm \\u — UrWo is above a prescribed accuracy e. 
The ALS approach can be thought of as a non-linear generalization of the 
block Gauss-Seidel method as we solve for one set of variables at a time while 
fixing all the others. We note that such a multi-linear approximation is pos- 
sible, particularly, due to the separated form of Ur with respect to ykS. 

Linear least-squares regression along y^. For a given k G {1, . . . ,d} and some 
r > 1, assume that scalars {si} and functions {u\{yi)}, i k, I = 1, . . . ,r, are 
given and fixed. We then seek the unknown univariate functions {u''i^{yk)}, 
/ = 1, . . . , r, from the restriction of (10) to the direction k. That is, 

WkiVk)} = arg min \\u - UrWn (11) 
{"fcfefc)} 

= arg min - V U(y(^)) - V nl(y^^))si TT u((yJ^M , 

/ = 1, . . . , r, which is in the form of a standard one-dimensional regression of 
scattered data. 

A discrete formulation of (11) can then be obtained by expanding each 
(unknown) univariate function u^kiVk) (or ^{{yk)) into some finite-dimensional 
basis of L2{T,V). In the present study, we choose spectral polynomials or- 
thogonal with respect to p{yk) - the probability density function of yk - for 
this purpose. We note that implicit in such a selection is the assumption 
that each factor u^kiVk) depends smoothly on yk- 

Spectral polynomial expansion of univariate factors u[{yk). Let {'ipakiyk)} be 
the set of spectral polynomials of degree ak ^ M & Nq associated with p{yk) 
as introduced in Section 3. We then approximate 

M 

uiiVk) ^ J^^^^k'Po.M, / = l,...,r, (12) 

afe=0 
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in which, for the interest of simphcity, we assumed the maximum degree of 
polynomial expansion, M, is the same for all separation terms and in all 
directions k. However, this assumption may be relaxed with no technical 
difficulty. In fact, for situations where the relative importance of the random 
inputs t/k is available, an anisotropic selection of M may considerably improve 
the convergence of the approach. 

Given the finite-dimensional approximation oiu^i^{yk) in (12), the problem 
(11) reduces to the discrete least-squares optimization 

Cfc = argmin ||m — Mj-ll^ (13) 

^ N f r / M \ 

to compute the expansion coefficients := (cq, ■ ■ ■ , c\j, ■ ■ ■ , Cq, ■ ■ • , c\jY ^ 
]^r{M+i) a^iQj-^g direction k. Setting the derivative of ||m — "Urllii with respect 
to Cfc to zero, we arrive at the normal equation 

AlA.cu = Alu, (14) 

to solve for c^.. Here, the matrix Ak E has a column block- 

structure Afc = [Al ... A^], where each column-block G M^^(*'^+i), 
/ = 1, . . . , r, is given by 

A{{j,au + 1) = sii;^,{y^^)\[u\{yW (15) 

for j = 1, . . . ,N and = 0, . . . , M. Moreover, the data vector u G 
contains realizations of u{y), i.e., u(j) = u{y^^^), j = 1, . . . , N. 

Having computed from (14), we then update each si such that si -f- 
■Sz||Mfc||D and accordingly set Ck ^ / \\u''f,\\ d . We do so to conveniently 
compare the relative contribution of each separation term. Additionally, as 
described in [8, 9], such a normalization of q may be used to control the loss 
of significant digits in the evaluation of the separated representation. 

The alternation to yk+i and separation rank increase. Once the factors 
{u^khjk)}^ / = l,...,r, are computed based on c^, as discussed above, the 
algorithm alternates to direction /c + 1 in order to compute {u^k+iiVk+i)}) 
I = l,...,r, using recent estimates of {u\{yi)}i^^, I = l,...,r. The pro- 
cess will continue until all factors {u''f,{yk)}f.^i, I = I, . . . ,r, are updated. In 
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practice, we initialize the the factors {u\{yk)}i=i, I = 1, . . . ,r, randomly (or 
according to some other guess); therefore, the sweeps through directions k 
must be performed multiple times until the residual norm — Mr||D does not 
decrease further. Notice that, by the ALS construction, — Mr||z) monoton- 
ically decreases or remains constant throughout the iterations. For a given 
separation rank r, if the residual norm ||n — UrWo at the end of the ALS 
updates is not smaller than a target accuracy e, then r is increased to r + 1 
and the algorithm is repeated. The overall non-intrusive ALS procedure is 
summarized in Algorithm 1. 



• Input : Data set D = {^(^y^^^;u{y^^^))} polynomial orders M, 

and accuracy e 

• Output: r, {c[j,^}fc=i, and Si for = 0, . . . , M and / = 1, . . . ,r 

• Set r = 1; (randomly) initialize {cj^^.} (or equivalently {uKyk)}) 
while \\u — Uj-Wd > e do 

while II n — UrWo decreases much do 
for A; ^ 1 to d do 

• Fix {c[j,.}, i k, and solve for {c[j^} using (14) 

• Update si ^ si \\u[\\d and c^^ ^ /II 4 lb 
end 

end 

• Set r = r + 1; (randomly) initialize {c^^.} (or equivalently 
i'^kiUk)}) and, thus, Sr ior k = 1, . . . ,d 

end 

Algorithm 1: Summary of the ALS method for the non-intrusive con- 
struction of separated representations. 

Remark 2. While in the preceding descriptions and in Algorithm 1 we require 
e as an input to our construction, in practice e may not be known a priori and 
a rough estimate of e may lead to under-fitting or over-fitting. We address 
this issue in Section 4, where we instead select an "optimal" value of r (and 
M) based on an error indicator. Therefore, no stopping criterion e (per se) 
is required. 
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Remark 3. For each k, as the coefficients {cj^^.} corresponding to different 
values of I are coupled in the linear systems (14), the factors {u[{yk)}, I = 
l,...,r, may change upon the introduction of the new separation term (to 
increase r). Alternatively, following the work in [4, 13], a greedy approach 
may he applied to perform updates only on the new separation term while 
freezing all the previous ones. The investigation of this approach is, however, 
beyond the scope of the present study. 

Notation Alert. For a simpler presentation, unless required, we hereafter 
eliminate the subscripts k from the linear systems in (14) and their exten- 
sions. 

3.3. Solution statistics 

Given the spectral coefficients {c[j }, one can estimate the statistics of u 
by integrating Ur analytically, e.g., for the mean and variance, numerically via 
quadrature integration, e.g., for the kurtosis, or via random sampling, e.g., 
for the probability density function. Specifically, the integral-form statistics, 
e.g., moments of u, can be obtained as a sequence of one-dimensional inte- 
grals, thanks to the separated form of Ur- For instance, the estimate of the 
mean and second moment of u is given by 

d 



Ur{y)p{y)dy = J^s; JJ ( / u[{yk)pk{yk)dyk 
-'"^ 1=1 k=l ^ 

= i^'Wo (16) 



(=1 k=l 



and 



„ r r d / „ 

E[uf\ = / ul{y)p{y)dy = y2y2siSi'Y\i u[{yk)ul{yk)pk{yk)dyk 

r r d / M \ 

= 5:E^.».'n . (17) 

l=\ V=\ k=\ \cife=0 / 

respectively, where we used the orthonormality of the spectral basis {^^a^ (z/fc) }• 
In general, the m-th moments of Ur can be similarly computed based on the 
expectations of the product of m spectral polynomials {^a^ (l/fc)}, which may 
be computed exactly using Gaussian quadrature integration. 

Alternatively, one may randomly draw independent realizations {tir-d/*^-''')} 
of u^. to generate Monte Carlo estimates of the statistics of u^- 
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3.4- Computational cost of ALS and dependence of N on d 

Here we elaborate on the computational complexity of a full sweep of the 
ALS and quantify its dependence on d and A^. For this purpose, we consider 
the asymptotic (but relevant) case of ^ rM. Each factor u\{yk) can be 
evaluated with complexity 0{M) for a given y^i^\ Therefore, the matrices A 
in (14) are generated with a cost of 0{rdMN). Each normal equation (14) 
may be solved using the Cholesky or LU decomposition of A^A with an 
asymptotic complexity 0{r'^M'^N) that includes forming the normal matrix 
A'^A and the right-hand-side vector A'^u as well as computing the solution 
c. We here assumed that r^M^ <C r'^M'^N. 

Following [7], to alternate from direction k io k + 1, the Ak+i matrices 
may be updated from Ak by multiplying products Ili^fc ^K?/!"'^) (15) by 

'"fc(l/fc"''')/'"L+i(?/fc+i)- This requires 0{rdMN) cost for a full sweep instead of 
0{rd'^MN) if Ak+i was set directly. Adding up the complexities of forming 
and solving the normal equations (14) for all directions k = l,...,d, we 
arrive at 

0{r^dM^N) (18) 

for the cost of a full sweep of the ALS iteration. Given that the number A^ 
of solution realizations generally depends on d, the estimate in (18) is in fact 
super-linear in d. We next discuss the dependence of A^ on d. 

Dependence of N on d. As opposed to the PC expansions (6) where the 
number of unknown coefficients Sq, grows exponentially with d, the total 
number of unknown coefficients in the separated representation (8) is rd{M+ 
1), which depends linearly on d, assuming that the separation rank r is 
independent of d. We, therefore, expect a linear dependence of A^ on d, i.e., 

A^ ~ 0{rdM), (19) 

for a successful separated approximation. Plugging (19) in (18) leads to an 
overall computational complexity 

Oir^d^M'"^) 

for a full sweep of the ALS iteration, which is quadratic in d. We highlight 
that for situations where evaluating the solution of the forward model is 
significantly expensive this overall cost is reasonable. 
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Remark 4. The estimate provided in (19) is based on the rule-of thumb 
that a successful regression problem requires a number of data points that 
is multiple of the number of unknowns. However, a more precise estimate 
of this dependence may be needed. In particular, the number of unknown 
coefficients in the one- dimensional regression problems (13) is r{M + 1), 
hence independent of d. Moreover, the same samples used to compute the 
coefficients in (14) are also utilized to compute c^+i, and so on. Therefore, 
the linear dependence of N on d may be relaxed. On the other hand, in the 
absence of the regularization approach of Section 4-1, N may need to grow 
like 0{r''M^), ( > 1, to keep the condition number of the matrices small; 
therefore, the linear dependence of N on r and M in (19) may be optimistic. 

4. Preventing over-fitting: regularization and selection of r and M 

Similar to other regression methods, the non-intrusive separated repre- 
sentation (8) may over-fit the data, especially when the number of solution 
samples, A^, is small. By the ALS construction, the residual norm ||u — Ur^o 
decreases or remains unchanged throughout the one-dimensional updates as 
well as when the separation rank r is increased. As u is not exactly separable 
or may be noisy, excessive reduction of — Mr||D results in over- fitting which 
may be accompanied by poor approximations of u where data is not available. 
Additionally, an unnecessarily large degree M of the spectral approximation 
of factors {u\{yk)} may also lead to over- fitting in the one-dimensional re- 
gression problems (13). Numerically, these two over-fitting scenarios give 
rise to ill-conditioned matrices A in (14), hence, inaccurate solutions c and 
subsequent ALS updates. 

The naive approach to reduce, or even prevent, over- fitting is to choose 
sma// values for r and M . However, this may lead to inaccurate (under- fitted) 
solution Ur- A more formal strategy is based on the concept of regularization, 
where additional smoothness constraints are enforced on Uj. while allowing 
for moderate values of r and M. 

In the present study, we examine a Tikhonov regularization approach to 
reduce the over-fitting for given choices of r and M. Subsequently, in Section 
4.2, we develop a perturbation-based error indicator based on the regularized 
version of problem (14) in order to identify the values of r and M leading to 
stable and, at the same time, accurate solutions. 
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4.1. Tikhonov regularization 

One classical way to regularize the solution of regression - or more gener- 
ally inverse - problems is through the use of Tikhonov regularization, where a 
smoothness promoting penalty term is added to the regression cost function, 
e.g., see [36]. While the non-intrusive construction of the separated repre- 
sentation is originally a non-linear problem, due to the multi-linear attribute 
of the ALS, the Tikhonov regularization can be naturally applied to the lin- 
ear regression problems (13). Specifically, consider the matrix formulation of 
(13), i.e., 

c = argmin — IIAc — ullg. (20) 

ta ^ ^ II 112 \ / 

In Tikhonov regularization, a smoothness penalty of the form ||Lc||2, L G 
j^r{M+i)xr{M+i) ^ jg g^fQ^-ggfj qu the solutiou c usiug the method of Lagrange 
multipliers, i.e., 

Cx = argmin -— \\Ac — uW^ + A^||lyc||2. (21) 

Here, L and A G M>o are called the Tikhonov matrix and regularization 
parameter, respectively. The regularization parameter A balances the relative 

111 1 1 2 

contributions of the mismatch function \\Ac— u\\^ and the penalty term 
\\Lc\\l 

It is straightforward to show that the solution to the Tikhonov-regularized 
problem (21) can be computed from the modified normal equation 

{A^A + X^L^L) Cx = A^u. (22) 

The effectiveness of the Tikhonov regularization depends on the choices 
of L and A. We next describe the approaches we followed in this study for 
the selection of these quantities. 

4.I.I. Choice of Tikhonov matrix L 

The standard form of the Tikhonov matrix is when L = I, the iden- 
tity matrix. For linear regression problems with orthonormal basis, ||-rc||2 
is essentially the second moment of the approximate solution. Therefore, 
large deviations in regions where data is highly sparse or not available will 
be penalized in the reconstruction. However, as demonstrated in the nu- 
merical example of Section 5.3, the choice of L = /, or more precisely L = 
Im+1 ®diag{si, . . . , s^) to account for the normalization of c, may lead to an 



14 



inadequate regularization of (20). Here Im+i and ® denote the identity ma- 
trix of size M+1 and the Kronecker product, respectively. The reason for this 
inadequacy is because no orthogonahty condition among the separation terms 
is enforced and ||(/m+i ® diag{si, . . . , Sr-))cfc||2 = JJi=i ^ E[ul] ig- 

nores the contributions of factors along directions i ^ k. Therefore, in the 
present study, we derive the regularization matrix L such that 



where, here, u,. refers to the separated approximation of u at the iteration in 
which the regularization is performed. Following (17), it is straightforward 
that E[u^] = Be, where the (/, /')-th block of the symmetric, positive- 
definite matrix B e W'^M+i)xr{M+i) gj^^^ 



Using (23) and assuming that IE[ii^] > 0, the Tikhonov matrix L can be 
set from the Cholesky decomposition of B, i.e.. 



While we here choose L according to the condition (23), other construc- 
tions, for instance, based on the gradient of u^. with respect to y may also 
be effective in promoting smoothness, see, e.g., [22]. 

Remark 5. Assuming that E[m^] > 0, the Tikhonov matrix L in (24) is 
invertible. This property will be exploited in the analysis of Section 4-2. 

4- 1-2. Selection of regularization parameter X: Generalized Cross Validation 



Selecting a suitable value for the regularization parameter A is essential to 
the success of the Tikhonov regularization. A substantially underestimated 
A results in non-smooth solutions due to the over-fitting. On the other hand, 
a significantly overestimated A leads to highly biased and smooth solutions. 
In both cases, the approximation of Ur may be inaccurate. Several statistical 
methods including Unbiased Predictive Risk Estimator, [51, 21], Morozov's 
Discrepancy Principle, [55], L-curve, [19, 38], and Generalized Cross Vali- 
dation (GCV), [73, 33], have been proposed to estimate A. Assuming the 



Lc\\l = E[u% 



(23) 




B = L^L. 



(24) 



(GCV) 
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samples u are noise-free, the first two approaches require an accurate knowl- 
edge of the second moment of the mismatch u — Ur- Such information is not 
generally available, especially that the second moment of u — Ur depends on 
r and M and varies throughout the ALS iterations. The last two methods 
above, however, do not require such information and may be applied. 

In the present study, we employ the GCV technique to select A. GCV 
is a modification of the classical leave- one- out Cross Validation (CV) where, 
for a fixed A, the regression is performed using — 1 sample points and 
the resulting approximation is tested - or validated - against the remaining 
one. Such calibration and validation are performed on the N possible leave- 
one-out scenarios and the validation error is averaged. The process is then 
repeated for several values of A and the CV selection of A is the one associated 
with the smallest average validation error. Specifically, let 

= argmin \\A^^^c-u^^l + \^\\Lc\\l (25) 



be the solution of the Tikhonov-regularized problem (20) for some value of 
A and when the realization u{j) is excluded from the dataset D. Then the 
CV choice of A is given by 

1 ^ 2 

A = argmin CV(A) = ttYI " ' (^6) 



which is equivalent to 

. 1 ,^ 

A 

" i — /V 

i=i 

see, e.g., [73]. Here, the hat matrix Hx = A{A^ A + X"^ L)~^ maps the 
vector of the realizations u to their predicted values based on the separated 
representation. Notice that the evaluation of the CV function in (27) is eas- 
ier than that of the (26) as, for each A, only one calibration is performed in 
(27). The diagonal entries Hx{j,j) depend on the ordering of the rows in A, 
hence the ordering of realizations {y^-'^- To address this drawback. Gener- 
alized Cross Validation (GCV) was introduced in [73, 33], where Hx{j,j) is 
approximated by the average of the trace of Hx, that is Hx{j,j) ~ -^tr{Hx)- 
Therefore, in GCV the selection of A is given by 

A = argmin GCV{X) = " (28) 
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Remark 6. We note that the range of X for which the GCV (or CV) function 
is evaluated to find the minimum can be set based on the (generalized) singular 
values of A. For further information on the selection of this range as well 
as the details on how GCV is efficiently evaluated in practice, we refer to 
[19, 72, 37] . 

4-2. Perturbation- based error indicator for selection of r and M 

By construction, the ALS approach monotonically decreases the residual 
norm ||m — throughout the iterations with both fixed separation rank 
r and when r is increased. However, due to the over-fitting issue discussed 
at the beginning of Section 4, a large r may result in a solution Ur that is 
far from u where samples are not available. The regularization approach of 
Section 4.1 mitigates this issue to a great extent, however, it does not provide 
a stopping criterion for the separation rank increase. Similarly, given a fixed 
number N of solution realizations, there is the need to identify a value for 
M, the degree of spectral basis, which does not lead to a highly under-fitted 
or over-fitted approximation. 

To this end, we propose the selection of r and M using a perturbation 
bound on the sensitivity of the regularized solution of (22) to the mismatch 
between u and Ur in (8). In particular, for a given M, the separation rank r 
is increased until the sensitivity of the solution to (22) across all directions 
k increases. This process is repeated for different values of M and the pair 
(r, M) is chosen such that the sensitivities of the associated regularized so- 
lutions (largest across directions k = 1, . . . ,d) is the minimum among other 
pairs (r, M) . More precisely, let 

e{y^'^) = u{y'^^^)-ur{y'^^^), j = l,...,N, (29) 

be the realization of the separated representation error e in (8) where Ur is the 
separated representation of u obtained with a fixed separation rank r, spectral 
polynomial degree M, but with a sufficiently large number of samples N^. 3> 
A^. For a moment assume that the vector of separated representation errors 
£ G with e(j) = e(j/*--'^), j = 1,. . . ,N, is known. Then the sensitivity 
of the Tikhonov solution Cx of (22) under this error, or perturbation of the 
realizations of Ur, may be bounded as follows. 

Proposition 1 (Perturbation bound). Let c\ and c.\ be solutions to the 
problem (21) associated with data vectors u and u — e, respectively, where 
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the perturbation vector e E with e{j) = e{y^^^), j = 1, . . . , N , is given in 
(29). Then, 

\\c.h - Wc.h ■ ^ ^ 

Proof. The proof is a straightforward simphfication of Theorem 5.5.1 in [36]. 
We only remind that, according to Remark 5, exists. □ 

Notice that the bound in (30) depends on ||£||2 which is not readily avail- 
able in practice. However, under some assumptions, the expected value of 
\\s\W may be estimated. In particular, we assume that e{y^^^), j = 1, . . . , N, 
can be approximated by realizations of some uncorrelated random variables 
with zero mean and variance a^. Then, following [34], o"^ may be estimated 

by 

Consequently, we have 



-2 II^^A '"'112 /o-iN 



E [||e||^] = NSl (32) 

We then define an error indicator of the form 

EI = „" „ \ 33 

which is similar to the perturbation bound in (30), except that we replaced 
II ell 2 with its expected value Naf. 

In the numerical examples of Section 5, we use the error indicator EI in 
(30) to estimate a pair (r, M) that does not lead to over-fitting or under- 
fitting. Specifically, for a fixed M, we monitor EI throughout the last d 
iterations of the ALS updates immediately before r is increased. We then 
take EI^^ as the largest EI among these d indicators. Our selection of 
the separation rank r and polynomial degree M, hence the stopping crite- 
rion of the separated representation, is the pair (r, M) corresponding to the 
smallest EI^^. Notice that the EI in (33) implicitly depends on r through 
A~^, ||Iy~^||2, a\, and ||ca||2- Roughly speaking, when r is too small or un- 
necessarily large, EI is large due to a\ or ||27~^||2 being large, respectively. 
For values of r close to the optimal separation rank (e.g., one resulting in 
minimum mean-squares error), however, these latter quantities are moder- 
ate. This is why the pair (r, M) obtained via EI may avoid under-fitting 
and over-fitting in the separated approximation. 
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Remark 7. We call the quantity in (33) an error indicator as opposed to 
an error estimator as (33) does not account for the bias introduced in the 
estimation of c due to the Tikhonov regularization. Additionally, due to 
the iterative construction of ALS, the matrices A are affected by the errors 
in the estimation of c coefficients from the previous iterations. While the 
effect of such errors may be modeled in the form of perturbations of A, we 
ignore this effect in the present study. Even so, as we shall see in numerical 
results of Section 5, the error indicator in (33) successfully identifies the 
pair (r, M) that avoids over-fitting and at the same time leads to accurate 
separated representation. 

We next investigate the performance of the non-intrusive separated rep- 
resentation together with the proposed regularization strategy on a manu- 
factured function and two ODE models subject to high-dimensional random 
inputs. 

5. Numerical Experiments 

5.1. A manufactured function 

Here we consider the separated representation of the function 

""(yi, • • • ,yio) = So + SiV^3(yi)?/'3(?/2) + S2^/'2(2/3) + S3^/'2(2/8) + S4^/'3(y9)+£^, (34) 

where {yk}^^=i are i.i.d standard Gaussian random variables and is the 
Hermite polynomial of degree z, normalized to have variance one. The noise e 
is a zero mean Gaussian random variable with standard deviation 0.0005 and 
is independent from {yk}^=i- The coefficients {si}f^i are set to sq = 0.55, 
si = S2 = -^2/4, S3 = -72/4, and S4 = -1/10. This setting leads 

to K[u] = 0.55 and Var[M] = 0.76 for the noise-free u. 

The purpose of this test case is to verify the selection of the separation 
rank r and spectral polynomial degree M via the error indicator of Section 

4.2. In particular, we expect that r < 5 as the the separated representation 
compresses the noise-free u with a separation rank that cannot exceed the 
number of orthogonal terms in (34), i.e., 5. This is due to the construction of 
separated representations where no orthogonality condition among the sepa- 
rated terms is assumed. Additionally, an accurate separated approximation 
of u requires an order M > 3 for the spectral expansion of the factors in (8). 

Figures 1(a) and 1(b) depict the selection of (r, M) based on the error 
indicator of Section 4.2 and by minimizing the standard deviation error. 
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respectively. In most cases the error indicator selects the reasonable values 
r = 3 and M = 3. As is observed from Figs. 1(c) and 1(d), except for 
the small sample size of iV = 200, the solution computed based on the 
error indicator leads to accuracies that are close to those generated based on 
minimizing the standard deviation error. 

5.2. Linear elliptic stochastic equation 

We consider the elliptic stochastic equation 

u{0,y) = u{l,y) = 0, 
where a{x, y) is the stochastic diffusion coefficient given by 

d 

a{x, y) = a{x) + cx^ ^ ^/lk(t)k{x)yk- 

k=l 

Here, d{x) is the mean of a{x, y) and aa is a coefficient to control the 
variability of a{x,y). The random variables {yk}k=i assumed to be in- 
dependent and uniformly distributed on [—1,1]. Additionally, {7j}fc=i and 
{0fe(a;)}fc=i are, respectively, d largest eigenvalues and the corresponding 
eigenfunctions of the covariance function 

Caa iXi,X2) = CXp 

where Ic is the correlation length of a{x, y) and is set to Ic = 1/14. This will 
lead to d = 40 dominant eigenvalues in the spectrum of Caa- Furthermore, 
we consider a = 0.1 and choose aa = 0.021 to guarantee the positivity of 
a{x, y) over V. The realization of u in (35) are obtained by a Finite Element 
solver with quadratic elements. The mean and standard deviation of the 
solution at X = 0.5 are the quantities of interest. The accuracy of the non- 
intrusive separated representation is compared against those of the Monte 
Carlo simulation and sparse grid stochastic collocation techniques. To obtain 
a reference solution, we compute the 3rd order PC expansion of solution at 
X = 0.5 using level I = 5 stochastic collocation with the Clenshaw-Curtis rule 
(see, e.g., [75] for more information on the stochastic collocation approach 
and the Clenshaw-Curtis rule). 



{xi - X2) 
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Number of samples N 




Number of samples N 



(b) 





Number of samples N 



(d) 



Figure 1: Evaluating the performance of the error indicator of Section 4.2 in estimating 
the separation rank r and polynomial degree M. (a) (r, M) estimated based on the error 
indicator of Section 4.2; (b) (r, M) that minimize the standard deviation error (computed 
based on the reference solution); (c) and (d) Mean and standard deviation error, respec- 
tively, when r and M are estimated using the error indicator and based on the minimum 
standard deviation error. (Separation rank r (n); Polynomial degree M (o); (r, M) esti- 
mated based on the error indicator (— -H— ); r and M that minimize the standard deviation 
error (— -e- — )). 
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Given the uniform distribution of y^, we expand the univariate factors 
{'^kiVk)} in a Legendre polynomial basis. Figure 2(a) displays the estimates 
of the separation rank r and polynomial degree M obtained for sample sizes 
N e {80, 200, 300, 400, 600, 800, 1000, 1500} using the error indicator EI dis- 
cussed in Section 4.2. As can be observed from this figure, r = 1 is selected 
for different values of A^, thus confirming that n(0.5, y) admits a low-rank 
separated representation. However, the estimates of the polynomial degree 
M depend on A^. In Fig. 2(b), for each A^, we obtain the pair (r, M) such 
that the resulting error in the standard deviation of the solution is minimum. 
Notice that such a selection of (r, M) requires the availability of the refer- 
ence solution. Figures 2(c) and 2(d) illustrate the convergence of the solution 
mean and standard deviation corresponding to (r, M) selected according to 
the above two strategies. While the estimates of (r, M) based on the er- 
ror indicator are not precisely the same as those minimizing the standard 
deviation error, the resulting accuracies in Figs. 2(c) and 2(d) are compa- 
rable. Moreover, the -E /-based selection of (r, M) leads to stable separated 
representations of the solution. 

Figures 3(a) and 3(b) compare the convergence of the mean and standard 
deviation of m(0.5, y) obtained by the separated representation, the standard 
Monte Carlo simulation, as well as the (isotropic) sparse grid stochastic col- 
location with the Clenshaw-Curtis abscissas. As the estimates obtained by 
the separated representation and the Monte Carlo simulation are sample de- 
pendent, two sets of independent realizations are considered. Figure 3(b), in 
particular, demonstrates a faster convergence and more accurate estimates 
of the solution standard deviation achieved by the separated representation 
approach. Additionally, as the construction of the non-intrusive separated 
representation is based on random sampling of the solution, the approxima- 
tion refinement may be achieved by incorporating as many additional samples 
as can be afforded. In the sparse grid stochastic collocation with nested ab- 
scissas, however, the number of additional solution samples is dictated by 
the underlying quadrature rule. 

5.3. The hydrogen oxidation problem 

As a third test case, we study the problem of hydrogen oxidation under 
the Supercritical Water (SCW) condition. SCW condition provides a means 
for fast and complete oxidation of organics without toxic formation. The 
chemical mechanism associated with the hydrogen oxidation may involve 
numerous uncertain parameters such as coefficients describing elementary 
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Figure 2: Evaluating the performance of the error indicator of Section 4.2 in estimating 
the separation rank r and polynomial degree M. (a) {r,M) estimated based on the 
error indicator of Section 4.2; (b) r and M that minimize the standard deviation error 
(computed based on the reference solution); (c) and (d) Mean and standard deviation 
error, respectively, when r and M are estimated using the error indicator and based on 
the minimum standard deviation error. (Separation rank r (□); Polynomial degree M (o); 
(r, M) estimated based on the error indicator (— -H — ); (r, M) that minimize the standard 
deviation error (— ^ — )). 
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Figure 3: Comparison of relative mean and standard deviation errors of the solution at a; = 
0.5 for the separated representation, Monte Carlo simulation, and sparse grid stochastic 
collocation with Clenshaw-Curtis abscissas (First set: solid line; Second set: dashed line), 
(a) Relative error in mean and (b) Relative error in standard deviation. (Monte Carlo 
simulation (o); Stochastic collocation (□); Separated representation(o); decay rate 

(Thin continuous line )). 
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reactions and thermodynamic properties of species, and has been the subject 
of several numerical studies in the area of uncertainty quantification [62, 65, 
63, 64, 50, 53, 3]. 

In the present work, we focus on the reduced mechanism of [62] consist- 
ing of 8 reversible elementary reactions and 7 species: OH , H, H2O, H2, 
O2, HO2, and H202- We assume that forward reaction rate constants kf^j 
and species enthalpies of formation '^H'j^ (associated with OH, H, H2O, 
HO2, and H2O2) are independent uncertain parameters, thus giving rise 
to an uncertainty space of dimensionality c? = 13. In particular, following 
[62, 65, 63], we model fc/j as independent lognormal random variables with 
prescribed medians kfj and such that Proh [kfj/(j < kfj < kfj(j] = 1 — 6. 
The so-called uncertainty factors Q are given in Table 1, and we here set 
6 = 0.004. The medians k/j are modeled by the Arrhenius equation k/j = 
AjT^^e"'^"'^/'^^, where R is the universal gas constant, T is the temperature, 
and parameters Aj, Eaj, and rij are derived from experimental data, see Ta- 
ble 1. The realizations of kfj with the above descriptions may be obtained 
from 

(log C ■ ^ 
^ yjj > J = i,...,8, 

in which {yj}^^^ are independent standard Gaussian random variables with 
cumulative distribution function (CDF) $. Moreover, we characterize the un- 
certainty in the species enthalpies of formation Aifj^ as independent Gaus- 
sian random variables with experimental mean values and standard devi- 
ations reported in Table 2. Other model parameters including entropies, 
internal energy, and initial conditions are considered to be deterministic. 

Perfectly stirred reactor (PSR). We process the oxidation in a perfectly stirred 
reactor (PSR), where the combustion is assumed to be spatially homoge- 
neous as a consequence of high diffusion rates and forced turbulent mixing. 
The PSR (isochoric here) is modeled by the conservation of mass, energy, 
and species, and we here use the CHEMKIN code [40] to solve the govern- 
ing equations over a 10 sec time interval. The deterministic initial ther- 
modynamic conditions are T = 823 kelvin and P = 246 bar, with initial 
species concentration Ch^ = 0.481 x lO'^ = 0.243 x lO'^ ^ , and 

CH2O = 0.99 X 10~^ For further information about the details of this 
problem, we refer the interested reader to [62]. 
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Tabic 1: Reduced mechanism of hydrogen oxidation [62]. Aj-. Pre-exponential coefficient, 
Uj: Temperature exponent, Ea.j/R'. Activation energy over universal gas constant, Q: 
Uncertainty factor. 



J 


Elementary reaction 


A, 






Ea,j/R 





1 


OH + H i — y H2O 


1.620^^ 


-14 





75 


3.16 


2 


H2 + OH < — > H2O + H 


1.024^- 


f8 


1.6 


1660 


1.26 


3 


H + 02< — > HO2 


1.481E^ 


-12 


0.6 





1.58 


4 


HO2 + HO2 i — > H2O2 + O2 


1.620E^ 


-12 





775 


1.41 


5 


H2O2 + 0H i — > H2O + HO2 







670 


1.58 


6 


H2O2 + H ^ HO2 + H2 


1.686^^ 


-12 





1890 


2.00 


7 


H2O2 < — >0H + 0H 


3.000^^ 


-14 





24400 


3.16 


8 


OH + H2O i — > H2O + O2 


2.891^^ 


-13 





-250 


3.16 



Table 2: Means and standard deviations of the species enthalpies of formation Ai79 i [62]. 



Species 


H 


OH 


H2O 


H2O2 


HO2 


index (z) 


1 


2 


3 


4 


5 




52.10 


9.3 


-57.80 


-32.53 


3.00 


2ai 


0.01 


0.2 


0.01 


0.07 


0.5 



5.3.1. Uncertainty quantification of OH concentration 

We apply the proposed non-intrusive separated approximation to com- 
pute the OH concentration as well as its mean and standard deviation. In 
order to assess the quality of our approximations, we generate an order p = 4 
Hermite PC reference solution whose coefficients are obtained using least- 
squares regression with N = 100, 000 realizations of OH concentration. In 
particular, we investigate the effect of two choices of L as well as the per- 
formance of the proposed error indicator for the selection of separation rank 
r and the Hermite polynomial degrees M. Additionally, we compare the 
accuracy of the solutions obtained using separated representation and the 
regression-based PC for small values of A^. 

Figure 4 compares the relative error in estimating the standard devia- 
tion of OH concentration at t = 2.2 sec with and without the Tikhonov 
regularization. For the regularized case, the Tikhonov matrix L of Section 
4.1.1 as well as L = Im+i ® diag{si, . . . , Sr) were used. As can be observed 
from Fig. 4, for large separation ranks r, the method over-fits the data. 
More importantly, the regularized approximation with the Tikhonov matrix 
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Figure 4: Regularized and non-regularized estimation of solution standard deviation as a 
function of the relative residual \\u — Ur\\D/\\u\\D {N = 200, t = 2.2 sec, and M = 2). For 
separation ranks r > 2 where over- fitting occurs, the Tikhonov matrix of Section 4.1.1 is 
considerably more effective than regularizing with L = Im+i 'Si diag{si, . . . , s^)- (Regu- 
larized with L defined in Section 4.1.1 (□); Regularized with L = Im+i ® diag{si, . . . , s^) 
(o); Non-regularized case(o)). 

L of Section 4.1.1 considerably reduces the over-fitting, as a result of which 
higher accuracies are achieved compared to a non-regularized case or when 
L = Im+1 ® diag{si, . . . , Sr) is applied. We refer to Section 4.1.1 for an 
explanation of this observation. 

The performance of the error indicator of Section 4.2 in estimating the 
optimal r and M is illustrated in Fig. 5 for the case of = 200 and 
t = 2.2 sec. Here, optimal r and M refer to those that correspond to smallest 
standard deviation error. In Fig. 5(a), we show the error indicator values 
as a function of r while fixing M = 2. In this case, the error indicator 
achieves its minimum at r = 2, which happens to be the separation rank 
corresponding to the minimum standard deviation error. We stress that the 
standard deviation errors were generated based on the reference values, while 
the error indicator does not require the knowledge of the reference solution. 
In Fig. 5(b), we show a similar analysis except that we fix r = 2 and let M 
vary. 

Figure 6 depicts the error indicator-based estimates of r and M for sepa- 
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Figure 5: Selection of the separation rank r and spectral polynomial degree M {N = 200, 
t = 2.2 sec) using the error indicator of Section 4.2. (a) Standard deviation error and the 
error indicator vs. r for M = 2 and (b) Standard deviation error and the error indicator 
vs. M for r = 2. Here optimal r and M refer to those that correspond to smallest standard 
deviation error. (Standard deviation error (□); Error indicator (o)). 



rated representation of OH concentration as a function of time. We observe 
that for most time instances these estimates are r = 1 and M = 2, respec- 
tively. Notice that these values depend on the number and particular sets 
of realizations used to reconstruct the solution. 

We next compare the accuracy of the separated representation and the 
regression-based PC expansion in estimating the mean and standard devia- 
tion of OH concentration, see Figs. 7 and 8. To achieve a stable approx- 
imation, the latter approach requires about = 650 realizations of OH 
concentration for a total degree p = 3 Hermite PC expansion. The accuracy 
of the p = 2 PC expansion was found to be significantly below those of the 
separated representation; therefore, that case was not considered further. As 
can be observed from Fig. 8, the two approaches achieve similar accuracies, 
in terms of maximum relative errors over the analysis time interval, while 
the separated representation requires only about half as many number of 
realizations of OH concentration. 

6. Conclusion 

This work introduced a non-intrusive algorithm based on separated rep- 
resentations to approximate model solutions with high- dimensional random 
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Figure 6: Values of separation rank r and spectral polynomial degree M {N = 200) 
obtained from the error indicator of Section 4.2. (r (□); M (o)). 
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Figure 7: The OH concentration as a function of time. (PC-based reference solution 

constructed with TV = 100,000 ( ); PC regression with N = 650 (•); Separated 

representation with N = 300 (□)). 



29 




123456789 10 123456789 10 



Time (sec) Time (sec) 

(a) (b) 

Figure 8: Comparison of separated representation and PC regression in estimation of mean 
and standard deviation of OH concentration, (a) Relative error in mean and (b) Relative 
error in standard deviation. Separated representation with N = 300 achieves a similar 
maximum error (over time t) as that of the PC regression. (PC regression with N = 650 
(o); Separated representation with N = 300 (□); Separated representation with N ~ 200 
(o))- 



inputs. Separated representations may be thought of as generalizations of 
matrix Singular Value Decomposition (SVD) to higher order tensors and 
multivariate functions. In particular, a multivariate function is decomposed 
into a linear sum of unknown functions that are separated with respect to the 
inputs. In this study, a regression approach was presented to stably compute 
the separated representation of high- dimensional stochastic functions using 
their randomly generated realizations. When the number of separated func- 
tions in the representation (known as the separation rank) is independent of 
the number d of random inputs, a successful approximation may be achieved 
with a number of solution realizations that depends linearly on d. The com- 
putational complexity of such an approximation grows quadratically in d. 
Therefore, the proposed framework may drastically reduce the issue of curse- 
of-dimensionality, a bottleneck for uncertainty quantification of systems with 
high- dimensional random inputs. One of the challenges for regression-based 
techniques is the issue of over-fitting, which may lead to numerical instabilies 
and inaccurate approximations. To tackle this issue for our construction, we 
adopted a Tikhonov regularization approach in which the Tikhonov matrix 
was designed to penalize large values of the second moment of the approxi- 
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mate solution. The regularization parameter was estimated using the Gener- 
ahzed Cross Vahdation (GCV) approach. Furthermore, a perturbation-based 
error indicator was derived to select the parameters of the separated repre- 
sentation to further prevent over-fitting (as well as under- fitting) . 

The performance of the proposed method was explored through its ap- 
plication to the reconstruction of a manufactured function as well as the 
numerical solution of two ODE problems with high- dimensional random in- 
puts. The first ODE problem discussed an elliptic differential equation in 
which the diffusion coefficient was a function of d = 40 random variables. 
It was demonstrated, numerically, that the proposed non-intrusive approach 
outperformed the standard sparse grid stochastic collocation, a widely used 
technique for approximating high-dimensional functions. The second ODE 
example focused on quantifying the effect of parametric uncertainties in a 
hydrogen oxidation problem. The advantage of the proposed approach com- 
pared to the regression-based polynomial chaos expansion was observed nu- 
merically. 

The present construction of separated representation was for scalar- valued 
stochastic functions. Extensions to vector-valued cases is the subject of an 
ongoing work. Such an extension is envisioned to exploit the low-rank struc- 
ture of solution (if exists) in the physical/temporal variables to further reduce 
the number of required solution realizations. 
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